SCIENTIFIC NOTE

VALIDATION  OF  ECOLOGICAL  NICHE  MODELS  FOR  POTENTIAL 
MALARIA  VECTORS  IN  THE  REPUBLIC  OF  KOREA 

DESMOND  H.  FOLEY,1  TERRY  A.  KLEIN,2  HEUNG  CHUL  KIM,3  TRACY  BROWN,1 
RICHARD  C.  WILKERSON1  and  LEOPOLDO  M.  RUEDA1 

ABSTRACT.  Data  on  molecularly  identified  adult  and  larval  mosquitoes  collected  from  104  sites  from 
the  Republic  of  Korea  (ROK)  in  2007  were  used  to  test  the  predictive  ability  of  recently  reported  ecological 
niche  models  (ENMs)  for  8  potential  malaria  vectors.  The  ENMs,  based  on  the  program  Maxent  and  the 
lowest presence threshold criterion, predicted 100% of new collection locations for Anopheles sinensis, An.
belenrae, An. pullus, and An. sineroides; 96% of locations for An. kleini; and 83% for An. lesteri, but were
relatively unsuccessful for the infrequently collected non-Hyrcanus group species An. koreicus and An.
lindesayi japonicus. The ENMs produced with the use of Maxent had fewer omission errors than those using
the Genetic Algorithm for Rule-Set Prediction program. The results emphasize the importance of
independent test data for validation and improvement of ENMs, and lend support for the further
development  of  ENMs  for  predicting  the  distribution  of  malaria  vectors  in  the  ROK. 

Since 1993, vivax malaria has become an
annual threat to military personnel and civilians
in the Republic of Korea (ROK; South Korea),
especially along the demilitarized zone (DMZ)
separating the ROK from the Democratic People’s
Republic of Korea (DPRK; North Korea)
(Kho  et  al.  1999,  Park  et  al.  2003,  Ciminera  and 
Brundage  2006,  Han  et  al.  2006,  Yeom  et  al. 
2007,  Kim  et  al.  2009,  Klein  et  al.  2009).  To 
identify  suitable  areas  for  malaria  transmission, 
we  developed  ecological  niche  models  (ENMs)  to 
predict  the  distribution  of  the  8  candidate  malaria 
vector  species  reported  from  the  ROK  (Foley  et 
al. 2009). These species are: Anopheles sinensis
sensu stricto (s.s.) Wiedemann, An. pullus M.
Yamada, An. lesteri Baisas and Hu (= An.
anthropophagus), An. sineroides S. Yamada, An.
kleini Rueda, An. belenrae Rueda, An. lindesayi
japonicus S. Yamada, and An. koreicus S.
Yamada and Watanabe. Because of the lack of
morphological  markers,  a  polymerase  chain 
reaction  (PCR)  technique  (Li  et  al.  2005)  is 
required  to  separate  5  of  the  6  members  of  the 
Hyrcanus  group  found  in  the  ROK.  Two  of  these 
species, An. kleini and An. belenrae, have only
recently been recognized (Rueda 2005), necessitating
a reassessment of information about
malaria vectors in the ROK. Until the identity
of the malaria vectors is established, we consider
all  8  species  as  candidates,  although  results  from 
ENMs  suggest  that  An.  lindesayi  japonicus  and 
An.  koreicus  are  unlikely  to  be  malaria  vectors 
(Foley  et  al.  2009). 

Foley  et  al.  (2009)  used  the  Genetic  Algorithm 
for  Rule-Set  Prediction  (GARP)  (Stockwell  and 
Noble  1992)  and  a  maximum  entropy  approach 
(Maxent)  (Phillips  et  al.  2006)  for  ENMs  of 
Korean  mosquitoes.  In  certain  comparisons, 
Maxent  has  been  shown  to  outperform  GARP 
(Elith  et  al.  2006,  Phillips  et  al.  2006),  and 
Hernandez  et  al.  (2006)  and  Pearson  et  al.  (2007) 
found  that  Maxent  achieved  better  predictive 
success  rates  with  small  sample  sizes.  However, 
GARP  may  perform  better  at  extrapolation  for 
unsampled  areas  (i.e.,  transferability)  than  Maxent 
(Peterson  et  al.  2007,  but  see  Phillips  2008).  To 
reduce  ENMs  to  presence  and  absence  predictions, 
Foley  et  al.  (2009)  used  the  lowest  presence 
threshold  (LPT),  which  identifies  pixels  that  are 
at  least  as  suitable  as  the  lowest  value  associated 
with  a  species’  presence  (Pearson  et  al.  2007). 
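
The LPT rule described above can be illustrated with a minimal sketch. It assumes the model output is available as a 2-dimensional array of suitability scores with "no data" cells stored as NaN; the array values, the presence cells, and the function names are hypothetical and are not taken from Foley et al. (2009).

```python
import numpy as np

def lowest_presence_threshold(suitability, presence_cells):
    """Return the lowest suitability score found at any known presence cell."""
    rows, cols = zip(*presence_cells)
    return float(np.min(suitability[list(rows), list(cols)]))

def to_presence_absence(suitability, threshold):
    """Flag cells at least as suitable as the threshold; NaN cells stay False."""
    with np.errstate(invalid="ignore"):
        return suitability >= threshold

# Hypothetical 3 x 4 suitability surface (values invented for illustration).
suitability = np.array([
    [0.10, 0.35, 0.80, np.nan],
    [0.05, 0.42, 0.66, 0.20],
    [0.02, 0.15, 0.35, 0.90],
])
presence_cells = [(0, 2), (1, 1), (2, 3)]  # (row, col) of known collections

lpt = lowest_presence_threshold(suitability, presence_cells)
binary = to_presence_absence(suitability, lpt)
```

By construction, every cell used to build the model is predicted present under this threshold, so omission can only be assessed with data that were not used to set it.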

We  are  interested  in  the  performance  of  ENMs 
because  quantification  of  the  area  of  vector-borne 
disease risk (Mal-area), where humans, pathogens,
and vectors potentially overlap (Foley et al.
2008),  is  dependent  on  realistic  models  of  vector 
distribution.  Model  validation  is  often  an  integral 
part  of  ENM  development.  For  example,  the 
genetic  algorithm  in  GARP  allows  rules  applied 
to  training  data  to  evolve  to  maximize  predictive 
accuracy  based  on  test  data.  Extrinsic  test  data  set 
aside  from  model  development  can  also  be 
overlaid  onto  predicted  distributions  to  calculate 
omission  error  (percentage  of  independent  test 
points not predicted by the model). Extrinsic test
data can be a random subset of collection
locations, or a spatially stratified subsample of
collection points; the latter is seen as a more
rigorous test of the transferability of the models
(Peterson et al. 2007). With the use of a
subsample of test data, Foley et al. (2009) found
that An. sinensis had the best support among
Maxent models for species with larger sample
sizes, according to random expectations of
omission error for the study area predicted
present. For species with smaller sample sizes
(<25), all except An. lesteri had lower omission
error than expected at random. The GARP models for
species with larger sample sizes indicated that
all species had lower omission error than expected at
random but, for species with smaller sample sizes,
only An. lesteri and An. koreicus had lower
omission error than expected at random.

Table 1. Validation of the ecological niche models in Foley et al. (2009) for Anopheles belenrae, An. kleini, An.
koreicus, An. lesteri, An. lindesayi japonicus, An. pullus, An. sinensis, and An. sineroides based on comparison with
new collection data. The number of collection grids that were correctly predicted positive by the Genetic Algorithm
for Rule-Set Prediction (GARP) and Maxent models for each species is shown, along with the omission error rate
(in parentheses).

Species         Specimens  Locations¹  No. grids  Final no. grids  GARP       Maxent
An. belenrae    58         30          15         13               8 (0.38)   13 (0)
An. kleini      1,378      71          26         24               21 (0.13)  23 (0.04)
An. koreicus    10         10          2          2                1 (0.50)   0 (1.00)
An. lesteri     38         13          8          6                2 (0.67)   5 (0.17)
An. lindesayi   358        11          7          7                2 (0.71)   4 (0.43)
An. pullus      418        60          19         17               16 (0.06)  17 (0)
An. sinensis    6,898      84          37         30               27 (0.10)  30 (0)
An. sineroides  412        24          10         9                7 (0.22)   9 (0)

¹ Total numbers of new collection locations for each species were reduced to 0.00833° grids; grids that occurred in “no data”
areas of the models, and grids that overlapped grid point positions used for model construction, were then removed, resulting in the final
number of grids for each species that were available for comparison.
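
As an illustration of the omission error calculation, the sketch below recomputes the rates from the final grid counts in Table 1. The one-sided binomial function is only one possible way to express a "random expectation" (the chance that at least the observed number of test grids would fall inside the predicted area if they were placed at random); the text does not state which test Foley et al. (2009) used, so this choice, and the function names, are assumptions.

```python
from math import comb

# (final no. grids, GARP correct, Maxent correct), copied from Table 1.
TABLE_1 = {
    "An. belenrae": (13, 8, 13),
    "An. kleini": (24, 21, 23),
    "An. koreicus": (2, 1, 0),
    "An. lesteri": (6, 2, 5),
    "An. lindesayi": (7, 2, 4),
    "An. pullus": (17, 16, 17),
    "An. sinensis": (30, 27, 30),
    "An. sineroides": (9, 7, 9),
}

def omission_error(n_test, n_predicted):
    """Fraction of independent test grids that the model failed to predict."""
    return (n_test - n_predicted) / n_test

def prob_at_least(n_test, n_predicted, area_fraction):
    """P(X >= n_predicted) when each test grid independently falls in the
    predicted-present area with probability area_fraction (assumed null model)."""
    return sum(
        comb(n_test, k) * area_fraction**k * (1 - area_fraction) ** (n_test - k)
        for k in range(n_predicted, n_test + 1)
    )

for species, (n, garp, maxent) in TABLE_1.items():
    print(f"{species:15s} GARP {omission_error(n, garp):.3f}  "
          f"Maxent {omission_error(n, maxent):.3f}")
```

Rates are printed to 3 decimal places because a value such as 3/24 = 0.125 sits exactly on a rounding boundary and can display as 0.12 rather than the 0.13 shown in Table 1.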

Additional  validation  can  be  conducted  by 
comparing  an  ENM  with  expert  opinion  about  a 
species’  distribution,  or  with  observations  about  a 
species’  distribution  from  the  literature.  A  less 
common, but perhaps more convincing, demonstration
of the utility of ENMs is to test
predictions with newly collected data. Thus, we
aimed to test the accuracy of the ENMs developed
by Foley et al. (2009) by comparing
predictions  with  independent  mosquito  collection 
data  obtained  from  the  ROK  in  2007. 

The  numbers  of  adult  and  larval  specimens  and 
number  of  collection  locations  are  shown  in 
Table  1.  Species  were  identified  with  the  use  of 
PCR  as  described  by  Li  et  al.  (2005).  Many  of  the 
collection  sites  for  2007  were  close  to  one  another, 
so points were converted to grids of 0.00833°
resolution (approximately 1 km at the equator) that matched the
grid size and position in the models of Foley et al.
(2009). For example, for An. sinensis, this process
resulted in 37 spatially independent grids (see
Table 1), of which 35 did not coincide with any of
the 80 grid sites for this species used in Foley et al.
(2009). Five of these 35 grids occurred in areas of
the model that were classified as “no data,”
resulting in 30 grids that could be used for final
comparison (see Table 1). Of these new An.
sinensis sites, 90% were predicted positive by the
GARP model, and 100% by the Maxent model of
Foley et al. (2009). The Maxent model accurately
predicted new locations for An. sinensis, even
when these were geographically distant from
model input locations.
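
The reduction from collection points to comparison grids can be sketched as follows. This is a simplified illustration, not the authors' workflow: the coordinates are invented, the grid origin is assumed to be (0, 0) rather than the origin of the actual model rasters, and the function names are hypothetical.

```python
import math

CELL = 0.00833  # grid resolution in degrees, as stated in the text

def to_grid(lon, lat, origin=(0.0, 0.0)):
    """Return the integer (column, row) index of the cell containing a point."""
    return (
        math.floor((lon - origin[0]) / CELL),
        math.floor((lat - origin[1]) / CELL),
    )

def independent_test_grids(new_points, training_points, no_data_cells=frozenset()):
    """Collapse points to unique cells, then drop cells that were used for
    model construction or that fall in "no data" areas of the model."""
    new_cells = {to_grid(lon, lat) for lon, lat in new_points}
    training_cells = {to_grid(lon, lat) for lon, lat in training_points}
    final_cells = new_cells - training_cells - set(no_data_cells)
    return new_cells, final_cells

# Hypothetical (longitude, latitude) pairs; the first two share a cell.
new_points = [(127.0001, 37.5001), (127.0002, 37.5002),
              (127.5, 37.9), (126.8, 35.2)]
training_points = [(127.5001, 37.9001)]   # same cell as the third new point
no_data_cells = {to_grid(126.8, 35.2)}    # pretend this cell has no model value

new_cells, final_cells = independent_test_grids(
    new_points, training_points, no_data_cells
)
```

In this toy case 4 points collapse to 3 cells, and 1 cell remains for comparison, mirroring the 37, 35, and 30 grid sequence reported for An. sinensis.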

The  Maxent  models  of  Foley  et  al.  (2009)  were 
updated with the addition of collection data from
2007 (Fig. 1). The addition of points for An.
belenrae, An. kleini, and An. pullus resulted in a
small  increase  in  the  areas  predicted  suitable  for 
these  species,  although  the  overall  distribution 
pattern  did  not  change  markedly.  For  An. 
sinensis,  the  addition  of  points  reduced  the  area 
predicted  suitable  for  this  species  from  54.6%  to 
46.3%,  although  the  overall  distribution  pattern 
did  not  change  markedly.  For  An.  sineroides,  the 
addition  of  points  (mainly  in  the  northwest  of  the 
ROK)  increased  the  area  predicted  suitable  for 
this  species  from  38.2%  to  40.8%.  For  An.  lesteri, 
the  addition  of  points  (mainly  in  the  southeast) 
increased  the  area  predicted  suitable  for  this 
species from 4.4% to 10.5%, mainly due to a
southward  expansion.  For  An.  lindesayi  japonicus, 
the  addition  of  points  (mainly  in  the  center  and 
south  of  the  ROK)  resulted  in  a  contraction  in  the 
south  and  west,  and  an  extension  to  the  east.  For 
An.  koreicus,  the  addition  of  points  decreased  the 
area  predicted  suitable  for  this  species  from 
17.4%  to  15.1%,  and  resulted  in  an  expansion  in 
the  north  and  central  part  of  the  ROK,  and  a 
contraction  in  the  south  and  east. 
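
The comparison of old and updated models (percentage of area predicted suitable, and the Consensus, Old, and New categories of Fig. 1) can be sketched with boolean rasters. The arrays below are invented for illustration, and the reading of "Old" and "New" as cells predicted by only one of the two models is an assumption about how the figure categories were defined.

```python
import numpy as np

def percent_suitable(binary, valid):
    """Percentage of valid (non "no data") cells predicted suitable."""
    return 100.0 * np.count_nonzero(binary & valid) / np.count_nonzero(valid)

def compare_models(old, new, valid):
    """Count cells predicted by both models, by the old model only,
    and by the new model only."""
    return {
        "Consensus": int(np.count_nonzero(old & new & valid)),
        "Old": int(np.count_nonzero(old & ~new & valid)),
        "New": int(np.count_nonzero(~old & new & valid)),
    }

# Hypothetical 2 x 4 study area; False in `valid` marks a "no data" cell.
valid = np.array([[True, True, True, False],
                  [True, True, True, True]])
old = np.array([[True, True, False, False],
                [False, True, False, False]])
new = np.array([[True, False, False, False],
                [False, True, True, False]])

old_pct = percent_suitable(old, valid)
new_pct = percent_suitable(new, valid)
categories = compare_models(old, new, valid)
```

Two models can predict the same percentage of suitable area while disagreeing on where it lies, which is why the per-cell categories are reported alongside the area totals.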

The  ENMs  produced  with  the  use  of  Maxent 
had  fewer  omission  errors  than  those  using 
GARP  under  the  conditions  of  this  study.  This 
finding  is  in  agreement  with  Elith  et  al.  (2006)  and 
other authors, who observed that Maxent outperforms
GARP, especially within more densely
sampled landscapes. Whether Maxent models
developed for the ROK would perform better
than GARP models in unsampled areas of
Southeast Asia is not known. We hope to validate
the updated ENMs with additional sampling that
was undertaken in the ROK in 2008.

Fig. 1. Updated ecological niche models of Anopheles belenrae, An. kleini, An. koreicus, An. lesteri, An. lindesayi
japonicus, An. pullus, An. sinensis, and An. sineroides in the Republic of Korea with the use of the program Maxent
and new collection data from 2007. Figure shows the areas of agreement (Consensus) and disagreement between the
current (New) models and those of Foley et al. (2009) (Old). The lowest presence threshold criterion was used, and
new collection data for each species are shown by circles.

The  results  for  some  Maxent  models  suggest 
that  omission  levels  are  lower  than  those  reported 
by  Foley  et  al.  (2009).  These  authors  suggested 
that  among  Maxent  models  for  species  with  lower 
numbers  of  data  points,  only  the  An.  lesteri  model 
failed.  However,  we  found  that  models  for  An. 
koreicus  and  An.  lindesayi  japonicus  had  higher 
levels  of  omission  error  than  for  An.  lesteri.  One 
possible  reason  for  this  discrepancy  is  that 
omission  levels  in  Foley  et  al.  (2009)  were 
calculated from models built with a subset of
points,  whereas  we  tested  their  final  models  that 
were  built  on  total  available  data  points.  For 
species  with  lower  numbers  of  data  points, 
additional  collection  data  could  improve  the 
accuracy  of  ENMs. 

Species habitat model accuracy can be important
for incriminating the vector species, for
understanding the ecological requirements of
species based on satellite data, and for determining
the Mal-area. We show that many of the
ENMs of Foley et al. (2009) accurately predict
where those species occur. It is important to note
that these models say nothing about the epidemiologically
important parameters of abundance
and  longevity.  However,  we  believe  that  our 
results  lend  support  for  the  further  development 
of  ENMs  for  predicting  the  distribution  of 
mosquito  disease  vectors  in  the  ROK. 

[...] herein are those of the authors and are not to [...]
the Department of the Army or the Department
of Defense.